KMID : 0381120210430091059
Genes and Genomics 2021, Volume 43, No. 9, pp. 1059-1064
Enhancing performance of gene expression value prediction with cluster-based regression
Seok Ho-Sik
Abstract
Background: The inherent correlations among gene expression levels have attracted attention. It was recently reported that a set of approximately 1000 landmark genes can be used to predict the expression of other genes (target genes).
Objective: The objective of this study is to predict expression values of target genes based on expression values of landmark genes.
Methods: A cluster-based regression method is proposed. In the proposed method, the training instances of a gene are grouped into clusters, and one estimator is fitted per cluster. A test instance is assigned to one of the clusters, and the regression model corresponding to that cluster predicts its expression value.
Results: Performance of the proposed method is measured on the GEO (Gene Expression Omnibus) expression data and the GTEx (Genotype-Tissue Expression) expression data. In terms of mean absolute error averaged across target genes, the proposed method significantly outperforms previous approaches in the case of the GEO expression data.
Conclusions: The experimental results show that the combination of clustering and regression can outperform state-of-the-art methods such as generative adversarial networks and a gradient-boosting-based method.
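The procedure described under Methods can be illustrated with a minimal sketch. This is not the author's implementation. The abstract does not name the clustering algorithm, so plain k-means is assumed here; kernel ridge regression is taken from the keyword list, while the RBF kernel, the hyperparameter values, and every name in the code (`ClusterKernelRidge`, `kmeans`, `assign`, `rbf_kernel`, `mean_absolute_error`) are hypothetical choices made for illustration. The data are synthetic stand-ins, not GEO or GTEx profiles.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per row of X."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns centroids and labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, assign(X, centroids)

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

class ClusterKernelRidge:
    """One kernel ridge regressor per cluster of training instances."""

    def __init__(self, n_clusters=2, alpha=1.0, gamma=0.1, seed=0):
        self.n_clusters, self.alpha = n_clusters, alpha
        self.gamma, self.seed = gamma, seed

    def fit(self, X, y):
        centroids, labels = kmeans(X, self.n_clusters, seed=self.seed)
        # Drop centroids that ended up with no training instances.
        self.centroids_ = centroids[np.unique(labels)]
        labels = assign(X, self.centroids_)
        self.models_ = []
        for j in range(len(self.centroids_)):
            Xj, yj = X[labels == j], y[labels == j]
            K = rbf_kernel(Xj, Xj, self.gamma)
            # Kernel ridge dual solution: (K + alpha * I) c = y
            dual = np.linalg.solve(K + self.alpha * np.eye(len(Xj)), yj)
            self.models_.append((Xj, dual))
        return self

    def predict(self, X):
        # Route each test instance to its cluster, then use that cluster's model.
        labels = assign(X, self.centroids_)
        y_hat = np.zeros(len(X))
        for j in np.unique(labels):
            Xj, dual = self.models_[j]
            mask = labels == j
            y_hat[mask] = rbf_kernel(X[mask], Xj, self.gamma) @ dual
        return y_hat

# Synthetic stand-in data: 80 profiles with 5 "landmark" features and one
# "target" value whose relation to the features differs between two groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 5)), rng.normal(5.0, 1.0, (40, 5))])
y = np.concatenate([X[:40].sum(axis=1), -X[40:].sum(axis=1)])

model = ClusterKernelRidge(n_clusters=2).fit(X, y)
mae = mean_absolute_error(y, model.predict(X))
```

In the setting of the abstract, one such model would be fitted per target gene, with the landmark gene expression values as features, and the per-gene mean absolute errors would then be averaged across target genes. The error computed here is on the training data and only demonstrates the mechanics.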
Keywords
Clustering, Gene expression value prediction, Kernel ridge regression, Landmark gene, Performance enhancing, Regression